Towards a unified medical microbiome ecology of the OMU for metagenomes and the OTU for microbes

Background Metagenomic sequencing technologies have offered unprecedented opportunities, and also challenges, to microbiology and to microbial ecology in particular. The technologies have revolutionized the study of microbes and enabled the high-profile Human Microbiome and Earth Microbiome projects. The change in terminology from microbes to microbiomes signals that our capability to count and classify microbes (microbiomes) has reached the same or a similar level as our capability for the biomes (macrobiomes) of plants and animals (macrobes). While traditional investigations of macrobiomes have usually been conducted through the naked eyes of naturalists (Linnaeus and Darwin) and through aerial and satellite images (remote sensing), large-scale investigations of microbiomes have been made possible by DNA-sequencing-based metagenomic technologies. Two major types of metagenomic sequencing technologies, amplicon sequencing and whole-genome (shotgun) sequencing, generate two contrasting categories of metagenomic data: OTU (operational taxonomic unit) tables representing microorganisms, and OMU (operational metagenomic unit) tables, where OMU is a new term coined in this article to represent various cluster units of metagenomic genes.

Results The ecological science of microbiomes based on the OTU representing microbes has been unified with the classic ecology of macrobes (macrobiomes), but the unification based on the OMU representing metagenomes has been rather limited. In a previous series of studies, we demonstrated the application of several classic ecological theories (diversity, composition, heterogeneity, and biogeography) to the study of metagenomes. Here I extend the unification of OTU and OMU by demonstrating the application of metacommunity assembly and ecological network analyses to the metagenomes of human gut microbiomes. Specifically, the neutral theory of biodiversity (Sloan’s near-neutral model), the stochasticity framework of Ning et al., core-periphery networks, high-salience skeleton networks, special trio motifs, and the positive-to-negative (P/N) ratio are applied to OMU tables from whole-genome sequencing and demonstrated with seven human gut metagenome datasets from the Human Microbiome Project.

Conclusions All of the ecological theories demonstrated previously and in this article, including diversity, composition, heterogeneity, stochasticity, and complex network analyses, are as applicable to OMU metagenomic analyses as they are to OTU analyses. Consequently, I strongly advocate the unification of OTU/OMU (microbiomes) with the classic ecology of plants and animals (macrobiomes) in the context of medical ecology.

Supplementary Information The online version contains supplementary material available at 10.1186/s12859-023-05591-8.

Table S1. The parameters of the Sloan (2006, 2007) neutral model fitted to metagenomic gene (MG) abundance data of the four datasets (each cohort is treated as both the source and destination communities in the Sloan model)
*The numbers enclosed in parentheses are the percentage of each category of MG out of the total number of MGs. **N is the average total gene abundance (total reads) per metagenome sample; m is the migration probability. ***Total is the total number of genes in the metacommunity of metagenomes, or in the assembly of metagenomes.

Table S2. The parameters of the Sloan (2006, 2007) neutral model for Type-II MFGC (metagenome functional gene cluster) with each cohort being treated as both source and destination communities of the Sloan model

Table S3. The parameters of the Sloan (2006, 2007) neutral model for Type-II MFGC with the healthy as source and diseased as destination community in the Sloan model
*Numbers inside parentheses are the percentages of MFGC in each category. **N is the average total gene abundance (total reads) per metagenome sample. ***Total is the total number of MFGCs.
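The roles of N and m in the Sloan neutral model can be illustrated with a minimal sketch. It assumes the beta-distribution form of the model that is common in the literature: a gene (or MFGC) with mean relative abundance p in the source community is expected to be detected in a fraction P(X >= 1/N) of local samples, where X follows Beta(N*m*p, N*m*(1 - p)). The function names, the 1/N detection limit, and the Monte Carlo approximation are illustrative assumptions, not the fitting procedure used to produce the tables above.

```python
import random


def expected_occurrence(p, N, m, draws=20000, seed=0):
    """Monte Carlo estimate of the neutral-model occurrence frequency.

    p: mean relative abundance in the source community (0 < p < 1)
    N: average total reads per metagenome sample
    m: migration probability
    """
    rng = random.Random(seed)
    alpha = N * m * p
    beta = N * m * (1.0 - p)
    detection_limit = 1.0 / N  # assumption: one read suffices for detection
    hits = sum(rng.betavariate(alpha, beta) >= detection_limit
               for _ in range(draws))
    return hits / draws


def classify(observed, lower, upper):
    """Label a unit relative to a neutral prediction interval (hypothetical helper)."""
    if observed > upper:
        return "above neutral"
    if observed < lower:
        return "below neutral"
    return "neutral"
```

Under these assumptions, a unit whose observed occurrence frequency falls outside the interval predicted by the model would be placed in the above-neutral or below-neutral category; how the interval itself is obtained is left open here.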

Table S4 (MS-Excel Table). The lists of Type-II MFGCs in each of the three categories (below neutral, neutral, and above neutral) for each metagenome treatment.

Table S5. The P-value of the randomization test for the differences (in the Sloan neutral model parameters for the MFGCs, listed in Table S2) between the healthy control and diseased treatments
*The test was done with the number of MFGCs, rather than with the percentage.

Table S6. The proportions of neutral and non-neutral MFGCs in the core and periphery, respectively, of the CPNs (core/periphery networks) of human gut metagenomes

Table S7. Randomization test for the proportions (see Table S6) of neutral and non-neutral MFGCs in the core vs. periphery between the healthy and diseased samples

Table S8A. The number of various trios in the class "Trios without MAO handle" in the MFGC networks (with FDR adjustment)

Table S8B. Randomization test for the number of various trios in the class "Trios without MAO handle" in the MFGC networks

Table S8C. The number of "Trios with MAO handle" in MFGC networks with FDR control ∑MFGC Type-I (KEGG)

Table S8D. Randomization test for the number of "Trios with MAO handle" in the MFGC networks

Table S9A. The P/N (positive to negative links) ratios in the MFGC (Metagenome Functional Gene Cluster) networks with FDR adjustment

Table S9B. Randomization test for the P/N (positive to negative links) ratios in the MFGC (Metagenome Functional Gene Cluster) networks with FDR adjustment
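As a minimal sketch of the P/N ratio, the snippet below counts positive and negative links in a signed network. It assumes the network is available as a list of (node, node, correlation) tuples that has already passed FDR filtering; the edge list and node labels are hypothetical and only illustrate the arithmetic.

```python
def pn_ratio(edges):
    """Return the ratio of positive to negative links.

    edges: iterable of (node_a, node_b, correlation) tuples; links with
    a zero coefficient are ignored.
    """
    edges = list(edges)  # allow generators to be passed in
    positive = sum(1 for _, _, r in edges if r > 0)
    negative = sum(1 for _, _, r in edges if r < 0)
    if negative == 0:
        # No negative links: the ratio is unbounded (or undefined if empty).
        return float("inf") if positive else float("nan")
    return positive / negative


# Hypothetical example: four positive links and two negative links.
example_edges = [
    ("c1", "c2", 0.82), ("c1", "c3", 0.65), ("c2", "c4", 0.71),
    ("c3", "c4", 0.58), ("c1", "c4", -0.60), ("c2", "c3", -0.74),
]
ratio = pn_ratio(example_edges)
```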

Table S10 (MS-Excel Table). The list of core/periphery nodes from the MFGC (metagenome functional gene cluster) networks

Table S11A. The core/periphery and nested structures in the MFGC (Metagenome Functional Gene Cluster) networks with FDR adjustment

Table S11B. Randomization test for the core/periphery and nested structures in the MFGC (Metagenome Functional Gene Cluster) networks with FDR adjustment

Table S12. The shared MFGC analysis between the healthy and diseased treatments
Algorithm A1 is randomization reassignments (remix) of all reads across samples and MFGCs. Algorithm A2 is randomization reassignments (remix) of all samples only. See Ma et al. (2019) ISME Journal or Ma (2020) for a detailed introduction of A1 and A2.
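The sample-remix idea behind Algorithm A2 can be sketched as a label permutation. The sketch assumes each sample is represented as a set of detected MFGC identifiers, pools the samples of each treatment, and compares the observed number of shared MFGCs with the counts obtained after randomly reassigning samples to treatments. The one-sided P-value definition, the function names, and the data layout are assumptions made for illustration; Algorithm A1 (remix of reads) is not shown, and the cited papers remain the reference for the actual procedures.

```python
import random


def shared_count(group_a, group_b):
    """Number of MFGCs detected in both pooled groups of samples."""
    pooled_a = set().union(*group_a)
    pooled_b = set().union(*group_b)
    return len(pooled_a & pooled_b)


def remix_samples_test(healthy, diseased, n_perm=999, seed=0):
    """Permute sample labels and compare shared counts with the observed one.

    Returns (observed shared count, one-sided P-value), where the P-value
    is the proportion of remixes sharing no more MFGCs than observed.
    """
    rng = random.Random(seed)
    observed = shared_count(healthy, diseased)
    pool = list(healthy) + list(diseased)
    n_healthy = len(healthy)
    at_most_observed = 0
    for _ in range(n_perm):
        rng.shuffle(pool)  # reassign samples to the two treatments
        if shared_count(pool[:n_healthy], pool[n_healthy:]) <= observed:
            at_most_observed += 1
    p_value = (1 + at_most_observed) / (1 + n_perm)
    return observed, p_value
```

With two treatments that share no MFGC at all, almost every remix shares more than observed, so the P-value under this definition is small.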

Table S13A. Statistical properties of the high-salience skeletons in the MFGC (Metagenome Functional Gene Cluster) networks with FDR adjustment

Table S13B. Randomization test for the statistical properties of the high-salience skeletons in the MFGC (Metagenome Functional Gene Cluster) networks with FDR adjustment

Table S15A. The number of various trios in the class "Trios without MAO handle" in the MF/MP (Metagenome Functions/Pathways) networks with FDR control

Table S15B. Randomization test for the number of various trios in the class "Trios without MAO handle" in the MF/MP (Metagenome Functions/Pathways) networks with FDR control

Table S15C. The number of "Trios with MAO handle" in the MF/MP (Metagenome Functions/Pathways) networks with FDR control

Table S15D. Randomization test for the number of "Trios with MAO handle" in the MF/MP (Metagenome Functions/Pathways) networks with FDR control

Table S16A. The P/N (positive to negative links) ratios in the MF/MP (Metagenome Functions/Pathways) networks with FDR control

Table S16B. Randomization test for the P/N (positive to negative links) ratios in the MF/MP (Metagenome Functions/Pathways) networks with FDR control

Table S17 (MS-Excel Table). The list of core/periphery nodes from the MF/MP (metagenome function/metagenome pathway) networks of the human gut metagenomes

Table S19. The shared MP/MF analysis between the healthy and diseased treatments
Algorithm A1 is randomization reassignments (remix) of all reads across samples and MFGCs. Algorithm A2 is randomization reassignments (remix) of all samples only. See Ma et al. (2019) ISME Journal or Ma (2020) for a detailed introduction of A1 and A2.
*Since there were only two periphery nodes, it was not possible to perform shared periphery analysis.

Table S20A. Statistical properties of the high-salience skeletons in the MF/MP (Metagenome Functions/Pathways) networks with FDR control